

Search for: All records

Creators/Authors contains: "Yao, T"


  1. Host-managed shingled magnetic recording (HM-SMR) drives give a capacity advantage to harness the explosive growth of data. Applications where data is sequentially written and randomly read, such as key-value stores based on Log-Structured Merge Trees (LSM-trees), make the HM-SMR drive an ideal solution due to its capacity, predictable performance, and economical cost. However, building an LSM-tree based KV store on HM-SMR drives presents severe challenges in maintaining performance and space efficiency due to the redundant cleaning processes for applications and storage devices (i.e., compaction and garbage collection). To eliminate the overhead of on-disk garbage collection (GC) and improve compaction efficiency, this paper presents GearDB, a GC-free KV store tailored for HM-SMR drives. GearDB proposes three new techniques: a new on-disk data layout, compaction windows, and a novel gear compaction algorithm. We implement and evaluate GearDB with LevelDB on a real HM-SMR drive. Our extensive experiments have shown that GearDB achieves both good performance and space efficiency, i.e., on average 1.71× faster than LevelDB in random write with a space efficiency of 89.9%.
  2. Key-value (KV) stores play an increasingly critical role in supporting diverse large-scale applications in modern data centers, hosting terabytes of KV items that might even reside on a single server due to virtualization. The combination of an ever-growing volume of KV items and storage/application consolidation is driving a trend toward high storage density for KV stores. Shingled Magnetic Recording (SMR) represents a promising technology for increasing disk capacity, but it comes at the cost of poor random write performance and severe I/O amplification. Applications and software working with SMR devices need to be designed and optimized in an SMR-friendly manner. In this work, we present SEALDB, a Log-Structured Merge tree (LSM-tree) based key-value store that is specifically optimized for SMR drives by addressing the poor random write performance and severe I/O amplification. First, for LSM-trees, SEALDB concatenates the SSTables of each compaction and groups them into sets. Taking sets as the basic unit for compactions, SEALDB improves compaction efficiency by mitigating random I/Os. Second, SEALDB creates varying-size bands on HM-SMR drives, named dynamic bands. Dynamic bands not only accommodate the storage of sets, but also eliminate the auxiliary write amplification from SMR drives. We demonstrate the advantages of SEALDB via extensive experiments in various workloads. Overall, SEALDB delivers impressive performance improvement. Compared with LevelDB, SEALDB is 3.42× faster on random load due to improved compaction efficiency and eliminated auxiliary write amplification on SMR drives.
  3. Pascual, Mercedes (Ed.)
    When Darwin visited the Galapagos archipelago, he observed that, in spite of the islands’ physical similarity, members of species that had dispersed to them recently were beginning to diverge from each other. He postulated that these divergences must have resulted primarily from interactions with sets of other species that had also diverged across these otherwise similar islands. By extrapolation, if Darwin is correct, such complex interactions must be driving species divergences across all ecosystems. However, many current general ecological theories that predict observed distributions of species in ecosystems do not take the details of between-species interactions into account. Here we quantify, in sixteen forest diversity plots (FDPs) worldwide, highly significant negative density-dependent (NDD) components of both conspecific and heterospecific between-tree interactions that affect the trees’ distributions, growth, recruitment, and mortality. These interactions decline smoothly in significance with increasing physical distance between trees. They also tend to decline in significance with increasing phylogenetic distance between the trees, but each FDP exhibits its own unique pattern of exceptions to this overall decline. Unique patterns of between-species interactions in ecosystems, of the general type that Darwin postulated, are likely to have contributed to the exceptions. We test the power of our null-model method by using a deliberately modified data set, and show that the method easily identifies the modifications. We examine how some of the exceptions, at the Wind River (USA) FDP, reveal new details of a known allelopathic effect of one of the Wind River gymnosperm species. Finally, we explore how similar analyses can be used to investigate details of many types of interactions in these complex ecosystems, and can provide clues to the evolution of these interactions. 